{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Running a Tempo pipeline in MLServer\n",
    "\n",
    "This example walks you through how to create and serialise a [Tempo pipeline](https://github.com/SeldonIO/tempo), which can then be served through MLServer.\n",
    "This pipeline can contain custom Python arbitrary code."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating the pipeline\n",
    "\n",
    "The first step will be to create our Tempo pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import os\n",
    "\n",
    "from tempo import ModelFramework, Model, Pipeline, pipeline\n",
    "from tempo.seldon import SeldonDockerRuntime\n",
    "from tempo.kfserving import KFServingV2Protocol\n",
    "\n",
    "\n",
    "MODELS_PATH = os.path.join(os.getcwd(), 'models')\n",
    "\n",
    "docker_runtime = SeldonDockerRuntime()\n",
    "\n",
    "sklearn_iris_path = os.path.join(MODELS_PATH, 'sklearn-iris')\n",
    "sklearn_model = Model(\n",
    "    name=\"test-iris-sklearn\",\n",
    "    runtime=docker_runtime,\n",
    "    platform=ModelFramework.SKLearn,\n",
    "    uri=\"gs://seldon-models/sklearn/iris\",\n",
    "    local_folder=sklearn_iris_path,\n",
    ")\n",
    "\n",
    "xgboost_iris_path = os.path.join(MODELS_PATH, 'xgboost-iris')\n",
    "xgboost_model = Model(\n",
    "    name=\"test-iris-xgboost\",\n",
    "    runtime=docker_runtime,\n",
    "    platform=ModelFramework.XGBoost,\n",
    "    uri=\"gs://seldon-models/xgboost/iris\",\n",
    "    local_folder=xgboost_iris_path,\n",
    ")\n",
    "\n",
    "inference_pipeline_path = os.path.join(MODELS_PATH, 'inference-pipeline')\n",
    "@pipeline(\n",
    "    name=\"inference-pipeline\",\n",
    "    models=[sklearn_model, xgboost_model],\n",
    "    runtime=SeldonDockerRuntime(protocol=KFServingV2Protocol()),\n",
    "    local_folder=inference_pipeline_path\n",
    ")\n",
    "def inference_pipeline(payload: np.ndarray) -> np.ndarray:\n",
    "    res1 = sklearn_model(payload)\n",
    "    if res1[0][0] > 0.7:\n",
    "        return res1\n",
    "    else:\n",
    "        return xgboost_model(payload)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This pipeline can then be serialised using `cloudpickle`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "inference_pipeline.save(save_env=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Serving the pipeline\n",
    "\n",
    "Once we have our pipeline created and serialised, we can then create a `model-settings.json` file.\n",
    "This configuration file will hold the configuration specific to our MLOps pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Overwriting ./model-settings.json\n"
     ]
    }
   ],
   "source": [
    "%%writefile ./model-settings.json\n",
    "{\n",
    "    \"name\": \"inference-pipeline\",\n",
    "    \"implementation\": \"tempo.mlserver.InferenceRuntime\",\n",
    "    \"parameters\": {\n",
    "        \"uri\": \"./models/inference-pipeline\"\n",
    "    }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Start serving our model\n",
    "\n",
    "Now that we have our config in-place, we can start the server by running `mlserver start .`. This needs to either be ran from the same directory where our config files are or pointing to the folder where they are.\n",
    "\n",
    "```shell\n",
    "mlserver start .\n",
    "```\n",
    "\n",
    "Since this command will start the server and block the terminal, waiting for requests, this will need to be ran in the background on a separate terminal."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Deploy our pipeline components\n",
    "\n",
    "Additionally, we will also need to deploy our pipeline components.\n",
    "That is, the SKLearn and XGBoost models.\n",
    "We can do that as:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:tempo:Deploying test-iris-sklearn\n",
      "INFO:tempo:Deploying test-iris-xgboost\n",
      "INFO:tempo:Deploying inference-pipeline\n"
     ]
    }
   ],
   "source": [
    "inference_pipeline.deploy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Send test inference request\n",
    "\n",
    "We now have our model being served by `mlserver`.\n",
    "To make sure that everything is working as expected, let's send a request.\n",
    "\n",
    "For that, we can use the Python types that `mlserver` provides out of box, or we can build our request manually."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'model_name': 'inference-pipeline',\n",
       " 'model_version': None,\n",
       " 'id': '4a13e269-b5ea-4270-8568-cc0819dc435d',\n",
       " 'parameters': None,\n",
       " 'outputs': [{'name': 'output0',\n",
       "   'shape': [1, 3],\n",
       "   'datatype': 'FP64',\n",
       "   'parameters': None,\n",
       "   'data': [0.8260199086532507, 0.038597383243756514, 0.1353827081029928]}]}"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import requests\n",
    "\n",
    "x_0 = np.array([[0.1, 3.1, 1.5, 0.2]])\n",
    "inference_request = {\n",
    "    \"inputs\": [\n",
    "        {\n",
    "          \"name\": \"predict\",\n",
    "          \"shape\": x_0.shape,\n",
    "          \"datatype\": \"FP32\",\n",
    "          \"data\": x_0.tolist()\n",
    "        }\n",
    "    ]\n",
    "}\n",
    "\n",
    "endpoint = \"http://localhost:8080/v2/models/inference-pipeline/infer\"\n",
    "response = requests.post(endpoint, json=inference_request)\n",
    "\n",
    "response.json()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
